James Jordan
April 14, 2022
For our project we have chosen to analyze government COVID-19 related data. As we all know, COVID-19 has impacted just about everyone around the world. Over the past two years, COVID-19 has filled hospitals worldwide and taken countless lives in the process. We want to look at the data to:
Find trends in government responses to the pandemic.
Discover factors that contribute to COVID-19 deaths.
Explore the reproductive rate of the virus.
Look at how the pandemic impacted the growth rates of economies around the world.
For our data analysis, we are using two large datasets. The first, provided by Oxford, is titled Oxford COVID Government Response Variables. The second is provided by Bloomberg, which has created resilience scores for the 53 largest economies in the world.
In addition to these two main datasets, we will also use GDP growth rate data and Google's COVID-19 Community Mobility Reports to support our regression analysis.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.6 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
#URL for the Oxford (OxCGRT) dataset
oxford_url <- "https://raw.githubusercontent.com/OxCGRT/covid-policy-tracker/master/data/OxCGRT_latest.csv"
#Read csv file
oxford <- read_csv(oxford_url)
## Rows: 291987 Columns: 61
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): CountryName, CountryCode, RegionName, RegionCode, Jurisdiction, V2...
## dbl (53): Date, C1_School closing, C1_Flag, C2_Workplace closing, C2_Flag, C...
## lgl (1): M1_Wildcard
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#URL for the Our World in Data dataset
country_info_url <- "https://covid.ourworldindata.org/data/owid-covid-data.csv"
#Read csv file
country_info <- read_csv(country_info_url)
## Rows: 185435 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): iso_code, continent, location, tests_units
## dbl (62): total_cases, new_cases, new_cases_smoothed, total_deaths, new_dea...
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
country_info <- country_info %>%
  rename(country = location) %>%
  select(country, everything())
rmarkdown::paged_table(country_info)
-We plan to build multiple linear regression models and one unsupervised learning method utilizing clustering.
-The first regression model’s dependent variable will be COVID-19 deaths per capita.
-The second regression model’s dependent variable is the reproductive rate.
-The third regression involves the economic impact and our dependent variable will be changes in quarterly GDP.
-The unsupervised learning method will be applied to the Bloomberg dataset to find countries that are similar to each other in their government response to the pandemic.
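The regression models described above could take roughly the following shape. This is only a minimal sketch on simulated data: the data frame `toy`, its column names (`stringency`, `median_age`, `gdp_per_cap`, `deaths_per_capita`), and the coefficients used to simulate the response are all hypothetical and are not taken from the Oxford or Our World in Data files.

```r
# Hypothetical toy data; names and values are illustrative only.
set.seed(42)
n <- 50
toy <- data.frame(
  stringency  = runif(n, 0, 100),  # assumed policy index on a 0-100 scale
  median_age  = runif(n, 18, 48),
  gdp_per_cap = runif(n, 1, 60)
)

# Simulate a response from assumed coefficients plus random noise.
toy$deaths_per_capita <- 5 - 0.02 * toy$stringency +
  0.10 * toy$median_age + rnorm(n, sd = 0.5)

# Fit a multiple linear regression with base R's lm().
fit <- lm(deaths_per_capita ~ stringency + median_age + gdp_per_cap,
          data = toy)

# Inspect coefficient estimates, standard errors, and R-squared.
summary(fit)
```

The same formula interface would apply to the other two models by swapping the dependent variable on the left-hand side of `~`.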
We are going to make visualizations using ggplot, plotly, and other tools to make our findings easier to read. It is important to understand the factors that affect COVID-19 deaths in case another global pandemic occurs. We hope to reveal trends that are not immediately obvious, so that our findings can help with the current pandemic and with future public health problems.
-Oxford’s goal is to track COVID-19 policy data consistently and use it to compare policy responses to COVID-19.
-They collect data on 23 different indicators across 180 different countries.
-The data collection began on January 1st, 2020, which marks the very early stages of the global pandemic.
-There are 21 live indicators that are entered into the dataset daily.
-Containment and Closure Policies: Marked as C1-C8 in the dataset.
-Economic Policies: E1-E4
-Health System Policies: H1-H8
-Vaccine Policies: V1-V4
-There are five kinds of data utilized in the dataset.
-Ordinal: On a simple scale of severity.
-Numeric: Specific number typically in U.S. dollars.
-Text: An open ended free response.
-Categorical: Range of eligible options to select and occasionally rank.
-Binary: Present (1) or absent (0).
-Overall Government Response Index: Calculated using ordinal indicators.
-Containment and Health Index: Combines lockdown restrictions and closure measures with health variables such as testing policy, contact tracing, and others. Calculated using all ordinal containment and closure policy indicators and health system policy indicators.
-Stringency Index: Measures the strictness of lockdown-style policies. Calculated using all ordinal containment and closure policy indicators, plus an indicator recording public information campaigns.
-Economic Support Index: Measures income support and debt relief. Calculated using all ordinal economic policy indicators.
-Risk of Openness Index: Based on the recommendations set out by the World Health Organization of measures that should be put in place before COVID-19 response policies can be safely relaxed.
-Policy indices are averages of the individual component indicators.
-$\text{index} = \frac{1}{k}\sum_{j=1}^{k} I_j$
-Where $k$ is the number of component indicators in an index and $I_j$ is the sub-index score for an individual indicator.
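As a small worked example of the averaging formula above, the following sketch computes an index from four sub-index scores. The scores, the indicator labels, and the 0-100 scale are assumptions made for illustration; they are not values from the Oxford dataset.

```r
# Hypothetical sub-index scores for k = 4 component indicators
# (assumed to be on a 0-100 scale).
sub_index <- c(C1 = 100, C2 = 75, C3 = 50, C4 = 25)

# k is the number of component indicators in the index.
k <- length(sub_index)

# The index is the simple average of the sub-index scores.
policy_index <- sum(sub_index) / k
policy_index  # 62.5, identical to mean(sub_index)
```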
So far, Oxford has incorporated subnational data for Brazilian states, Canadian provinces and territories, Chinese provinces, UK devolved nations, and U.S. states.
-There are three main uses for the OxCGRT data:
1. Describe all government responses relevant to a certain country.
2. Describe policies put in place by a given level and lower levels of government.
3. Compare government responses across different levels of government.
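One way to separate national from subnational records for these uses is to filter on the `Jurisdiction` column that appears in the column specification printed earlier. The sketch below runs on a made-up data frame rather than the real file. The jurisdiction codes `"NAT_TOTAL"` and `"STATE_TOTAL"` and the `StringencyIndex` column are assumptions that should be checked against the OxCGRT codebook before use.

```r
library(dplyr)

# Toy rows mimicking the OxCGRT layout; all values are made up.
toy_oxford <- data.frame(
  CountryName     = c("United States", "United States", "Brazil"),
  RegionName      = c(NA, "Texas", NA),
  Jurisdiction    = c("NAT_TOTAL", "STATE_TOTAL", "NAT_TOTAL"),  # assumed codes
  StringencyIndex = c(55, 48, 60)                                # assumed column
)

# Country-level records only.
national <- toy_oxford %>%
  filter(Jurisdiction == "NAT_TOTAL")

# Records for lower levels of government.
subnational <- toy_oxford %>%
  filter(Jurisdiction != "NAT_TOTAL")

# Compare average stringency across levels of government.
by_level <- toy_oxford %>%
  group_by(Jurisdiction) %>%
  summarise(mean_stringency = mean(StringencyIndex), .groups = "drop")
```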
Bloomberg created a “Covid Resilience Ranking” system for the 53 largest economies in the world.
Each country’s rank is based on its success in controlling the virus with the least social and economic disruption.
Data has been collected since June 2021.
-There are 11 indices considered when ranking each country.
-Vaccine Doses Per 100
-Lockdown Severity
-Flight Capacity
-Vaccinated Travel Routes
-1-Month Cases per 100k
-3-Month Case Fatality Rate
-Total Deaths Per 1M
```